Period for Reply

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 

- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) ☒ Responsive to communication(s) filed on 30 December 2003.
2a) ☒ This action is FINAL. 2b) □ This action is non-final.

3) □ Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims

4) ☒ Claim(s) 1-26 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) □ Claim(s) is/are allowed.

6) ☒ Claim(s) 1-26 is/are rejected.

7) □ Claim(s) is/are objected to.

8) □ Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) □ The specification is objected to by the Examiner.

10) □ The drawing(s) filed on is/are: a) □ accepted or b) □ objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

11) □ The proposed drawing correction filed on is: a) □ approved b) □ disapproved by the Examiner.

If approved, corrected drawings are required in reply to this Office action.

12) □ The oath or declaration is objected to by the Examiner.

Priority under 35 U.S.C. §§ 119 and 120

13) □ Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).

a) □ All b) □ Some* c) □ None of:

1. □ Certified copies of the priority documents have been received.

2. □ Certified copies of the priority documents have been received in Application No. .

3. □ Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).

* See the attached detailed Office action for a list of the certified copies not received.

14) □ Acknowledgment is made of a claim for domestic priority under 35 U.S.C. § 119(e) (to a provisional application).

a) □ The translation of the foreign language provisional application has been received.

15) □ Acknowledgment is made of a claim for domestic priority under 35 U.S.C. §§ 120 and/or 121.

Attachment(s)

1) □ Notice of References Cited (PTO-892)
2) □ Notice of Draftsperson's Patent Drawing Review (PTO-948)
3) □ Information Disclosure Statement(s) (PTO-1449) Paper No(s). .
4) □ Interview Summary (PTO-413) Paper No(s). .
5) □ Notice of Informal Patent Application (PTO-152)
6) □ Other:

U.S. Patent and Trademark Office

PTO-326 (Rev. 04-01) Office Action Summary Part of Paper No. 6

DETAILED ACTION

1. This action is responsive to communications: The Amendment forwarded to the examiner
on 12/30/03 to the original Application and Information Disclosure Statement filed on 02/24/00. 

2. Claims 1-9 and 12-24 remain rejected under 35 U.S.C. 103(a) as being unpatentable over
Brown et al (US: 5,913,208, 06/15/99). 

3. Claims 10-11 and 25-26 remain rejected under 35 U.S.C. 103(a) as being unpatentable
over Brown et al (US: 5,913,208, 06/15/99) in view of Microsoft Press Computer Dictionary, 
Microsoft Press, 1997, pp. 309. 

Claim Rejections - 35 USC §103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the
manner in which the invention was made. 

5. Claims 1-9 and 12-24 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Brown et al (US: 5,913,208, 06/15/99). 



-As per independent claim 1, Brown et al teach a method for identifying duplicate
documents, where, having received said documents, a hit-list (Fig. 3b) is generated of intrinsic and
non-intrinsic attributes (title, author, creation date, length, size, location, abstract, etc (columns 1
and 6, lines 64-67 and 51-54)) wherein the table structure (equivalent to metadata summary
sub-trees) of the attributes of each document pair are compared (i.e. if one document is missing an
attribute structure then the documents are considered distinct) and each document is shown to be
distinct from the other if their respective attributes aren't equal (Fig. 4: 430 & 450). Brown et al
do not disclose that the summary hit-list contains metadata. It was well known in the art that
document attributes such as title, size, location, etc are considered metadata of a document.
Brown et al also do not disclose that the summary hit-list includes a sub-tree structure. It would
have been obvious to one of ordinary skill in the art, to have used the Brown et al method of
comparing hit-lists of attributes and used the structural relations of those hit-list attributes as the
first determinant of distinct documents, because while both methods provide the same benefit of
identifying distinct documents, by checking the structure (layout) of those attributes first, it can
more quickly be established with the minimum amount of comparisons that the compared
documents are distinct from one another (i.e. both documents have structures A & B, but since one
has A before B and the other B before A they are considered distinct.)
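
The structure-first comparison reasoned above can be illustrated with a minimal Python sketch. It is not taken from Brown et al or from the application. The function names `structure_of` and `are_distinct`, the dictionary representation of a hit-list row, and the sample attribute values are all assumptions made for illustration only.

```python
from typing import Mapping, Tuple


def structure_of(summary: Mapping[str, str]) -> Tuple[str, ...]:
    """Return the ordered attribute names (the layout) of one hit-list row."""
    return tuple(summary.keys())


def are_distinct(first: Mapping[str, str], second: Mapping[str, str]) -> bool:
    """Return True when two hit-list rows cannot describe duplicate documents."""
    # Step 1: structural check. A missing attribute or a different attribute
    # order settles the question without reading any attribute value.
    if structure_of(first) != structure_of(second):
        return True
    # Step 2: same layout, so compare the attribute values one by one.
    return any(first[name] != second[name] for name in first)


# Hypothetical rows: both carry attributes A (title) and B (author),
# but in a different order.
row_ab = {"title": "Annual Report", "author": "J. Smith"}
row_ba = {"author": "J. Smith", "title": "Annual Report"}
row_copy = {"title": "Annual Report", "author": "J. Smith"}

print(are_distinct(row_ab, row_ba))    # True: A before B versus B before A
print(are_distinct(row_ab, row_copy))  # False: same layout, same values
```

A `False` result here only means the rows were not shown to be distinct; whether they are duplicates would still depend on a comparison of text content.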

-As per dependent claim 2, Brown et al further disclose wherein the table structure 
contains intrinsic and non intrinsic attributes (title, author, creation date, length, size, location, 
abstract, etc (columns 1 and 6, lines 64-67 and 51-54)) to be compared. 

-As per dependent claim 3, Brown et al further disclose that one of the attributes used for
determination could be an abstract or a title (text-content)(column 6, lines 51-54).

-As per dependent claim 4, Brown et al further disclose that one of the attributes used
for determination could be an abstract or a title (text-content)(column 6, lines 51-54).

-As per dependent claim 5, Brown et al further disclose noting the current pair of hit-list 
documents as duplicates if the text content are equal (i.e. the structure and non-text attributes 
have been compared but alone can not determine duplicates) (Fig. 4: 460). 

-As per dependent claim 6, Brown et al further disclose one embodiment of deleting
duplicates from the hit list group (column 8, lines 20-22: Fig. 8).

-As per dependent claims 7-9, Brown et al further disclose that in the hit-list (metadata
table) an entry identified as having one or more duplicates is cross-referenced in the
duplicate identifier field of the other duplicate (column 8, lines 11-16), wherein for two metadata
entries the entry number and duplicate identifier field constitute a 2x2 matrix (Fig. 3b). Brown
et al also teach that the process of identifying duplicates as distinct is achieved by comparing the
metadata structures, attributes, and text content of the attributes as seen in claims 1-3. Brown et
al do not teach storing a zero binary value in the first row and second column position of the
matrix to designate that the first and second documents are distinct. It would have been obvious
to one of ordinary skill in the art, to have used a binary indicator (0 or 1) in the Brown et al duplicate
identifier position concerning only two documents, because while Brown et al is set up for a
plurality of documents which could lead to many multiple dependencies, if a restriction of only
comparing two documents was applied to Brown et al, there would be no need for more than one
indicator (0 or 1) to clearly state the relationship between the first document and second
document.
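
The single binary indicator discussed above can be pictured with a short Python sketch. It is illustrative only: the function name `relation_matrix` is hypothetical, and the layout (rows and columns indexed by document, with 1 on the diagonal) is an assumption rather than the table of Brown et al Fig. 3b.

```python
from typing import List


def relation_matrix(are_duplicates: bool) -> List[List[int]]:
    """Build a 2x2 matrix relating a first and a second document.

    Position [0][1] (first row, second column) holds the indicator:
    0 designates that the two documents are distinct, 1 that they are duplicates.
    """
    flag = 1 if are_duplicates else 0
    return [
        [1, flag],  # assumption: a document is trivially a duplicate of itself
        [flag, 1],  # the relation is symmetric, so [1][0] mirrors [0][1]
    ]


distinct_pair = relation_matrix(False)
duplicate_pair = relation_matrix(True)

print(distinct_pair[0][1])   # 0: first and second documents are distinct
print(duplicate_pair[0][1])  # 1: first and second documents are duplicates
```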

-As per independent claim 12, Brown et al teach a system wherein a search engine
process produces a hit-list (metadata parser) (Abstract), where it has already been stated above
that the hit-list contains metadata (i.e. structure, attributes, and some text content), wherein a
hit-formatter processor selects one or more of duplicate instances of one another (summary
consolidator)(Abstract) with the ability of the formatter processor to delete duplicates from the hit
list group (repository) (column 8, lines 20-22: Fig. 8). Brown et al also teach wherein the
generated summary hit-lists are stored (in a repository) (Fig. 1 and Fig. 4: 410).

-As per dependent claim 13, Brown et al further teach that the formatter processor
(summary consolidator)(Fig. 4: 420 & 440) is configured to compare the metadata attribute
values of the hit-lists (metadata summaries) and compare the text content (i.e. abstract or title)
included in the metadata attribute hit-lists. As shown in claim 1, Brown et al does not show that
its formatter is configured to compare the structures of the metadata hit lists. It would have been
obvious to one of ordinary skill in the art, to have used the Brown et al method of comparing hit-lists
of attributes and used the structural relations of those hit-list attributes as the first determinant of
distinct documents, because while both methods provide the same benefit of identifying distinct
documents, by checking the structure (layout) of those attributes first, it can more quickly be
established with the minimum amount of comparisons that the compared documents are distinct
from one another (i.e. both documents have structures A & B, but since one has A before B and the
other B before A they are considered distinct.)

-As per dependent claims 14-16, Brown et al further teach wherein the system is
configured to compare the attribute values included in the hit-lists, the text content included in
the metadata hit-lists, and the metadata portion of the hit-list.

-As per independent claim 17 and dependent claims 18 and 19, Brown et al teach the 
method of classifying electronically posted documents of claims 1-3 respectively. Brown et al
does not teach that the method is a program product to be executed by a computer system stored
in a computer readable medium. It was well known in the art to convert a method for use by a
computer system to a program product so that that method can be executed on said system and 
any other systems that can execute the program. It was also well known in the art that a 
computer system for executing said program product would have a recordable media (memory). 

-As per dependent claim 20, Brown et al further teach identifying documents as
duplicates if text content of the metadata hit-lists (i.e. abstract or title) are equivalent (i.e. the
structure and non-text attributes have been compared but alone cannot determine duplicates)
(Fig. 4: 460).

-As per dependent claim 21, Brown et al further teach removing the duplicate metadata
hit-list from the group (column 8, lines 20-22; Fig. 8).

-As per dependent claims 22-24, Brown et al further teach that in the hit-list (metadata table) an
entry identified as having one or more duplicates is cross-referenced in the duplicate
identifier field of the other duplicate (column 8, lines 11-16), wherein for two metadata entries
the entry number and duplicate identifier field constitute a 2x2 matrix (Fig. 3b). Brown et al also
teach that the process of identifying duplicates as distinct is achieved by comparing the metadata
structures, attributes, and text content of the attributes as seen in claims 1-3. Brown et al do not
teach storing a zero binary value in the first row and second column position of the matrix to
designate that the first and second documents are distinct. It would have been obvious to one of
ordinary skill in the art, to have used a binary indicator (0 or 1) in the Brown et al duplicate identifier
position concerning only two documents, because while Brown et al is set up for a plurality of
documents which could lead to many multiple dependencies, if a restriction of only comparing
two documents was applied to Brown et al, there would be no need for more than one indicator
(0 or 1) to clearly state the relationship between the first document and second document.

6. Claims 10-11 and 25-26 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Brown et al (US: 5,913,208, 06/15/99) in view of Microsoft Press Computer Dictionary, 
Microsoft Press, 1997, pp. 309. 

Application/Control Number: 09/513,058
Art Unit: 2178

-As per independent claim 10, Brown et al teach the method of claim 1 as stated above,
and further disclose receiving not just two, but two or more entries (column 3, lines 54-55) of
documents and generating metadata hit-list summaries for each received document. Brown et al
do not teach that when receiving a plurality of documents to group the metadata hit-list
summaries according to the MIME-type designation of each document. The Microsoft Press
Computer Dictionary teaches us that MIME-types describe the contents of a document which can
be used to interpret the content of a file over the internet (pp. 309). It would have been obvious
to one of ordinary skill in the art, to have used the Brown et al method for identifying duplicate
documents from search results without comparing document content and grouping the hit-list
summaries by MIME-type, because documents of different MIME-types wouldn't have to be
compared as their intrinsic attributes would be wholly different, i.e. a regular text/plain
MIME-type couldn't be a duplicate of a text/html MIME-type, and thus this would increase efficiency
by significantly reducing the number of hit-list summary comparisons.
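
The efficiency rationale above can be made concrete with a small Python sketch that groups summaries by MIME-type and pairs documents only within a group. It is a sketch under stated assumptions, not the method of Brown et al or of the application: the `mime_type` key, the helper names `group_by_mime_type` and `candidate_pairs`, and the four sample summaries are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterator, List, Tuple

Summary = Dict[str, str]


def group_by_mime_type(summaries: List[Summary]) -> Dict[str, List[Summary]]:
    """Bucket hit-list summaries by their (assumed) 'mime_type' attribute."""
    groups: Dict[str, List[Summary]] = defaultdict(list)
    for summary in summaries:
        groups[summary["mime_type"]].append(summary)
    return dict(groups)


def candidate_pairs(summaries: List[Summary]) -> Iterator[Tuple[Summary, Summary]]:
    """Yield only the pairs that share a MIME-type and so need comparing."""
    for group in group_by_mime_type(summaries).values():
        yield from combinations(group, 2)


# Hypothetical hit-list: two text/plain and two text/html documents.
hits = [
    {"title": "notes", "mime_type": "text/plain"},
    {"title": "notes copy", "mime_type": "text/plain"},
    {"title": "home", "mime_type": "text/html"},
    {"title": "home mirror", "mime_type": "text/html"},
]

all_pairs = len(list(combinations(hits, 2)))
grouped_pairs = len(list(candidate_pairs(hits)))
print(all_pairs, grouped_pairs)  # 6 2
```

With four documents split evenly across two MIME-types, the number of pairwise summary comparisons drops from six to two.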

-As per dependent claim 11, Brown et al and the Microsoft Press Computer Dictionary
do not teach grouping more subsets of hit-list summaries by MIME-type. As it was taught above
in claim 10, it would have been obvious to have grouped as many MIME-type subsets as were 
required by the plurality of documents for the purpose of increased efficiency by significantly 
reducing the number of hit-list summary comparisons across MIME-types. 

-As per independent claim 25, Brown et al teach the method for classifying electronically 
posted documents of claim 10 as stated above. Brown et al do not teach that the method is a 
program product to be executed by a computer system stored in a computer readable medium. It 
was well known in the art to convert a method for use by a computer system to a program 
product so that that method can be executed on said system and any other systems that can 
execute the program. It was also well known in the art that a computer system for executing
said program product would have a recordable media (memory). 

-As per dependent claim 26, Brown et al does not teach grouping more subsets of hit-list
summaries by MIME-type. As it was taught above in claim 25, it would have been obvious to
have grouped as many MIME-type subsets as were required by the plurality of documents for the
purpose of increased efficiency by significantly reducing the number of hit-list summary
comparisons across MIME-types. 

Response to Arguments 

7. Applicant's arguments filed 12/30/03 have been fully considered but they are not 
persuasive. 

-In regard to independent claims 1, 10, 12, 17, and 25, the Applicant argues that the
Brown et al reference does not disclose generating a metadata summary for a first document and
a second document, wherein each metadata summary includes a sub-tree and each sub-tree
including a plurality of list items, comparing the list items of the summary sub-trees and
identifying the two documents as distinct if the list items of the summary sub-trees are not
equivalent. As stated above in the rejection of said independent claims, Brown et al teach
generating a metadata summary for a plurality of documents (hit-list: Fig. 3b) of intrinsic and
non-intrinsic attributes (title, author, creation date, length, size, location, abstract, etc (columns 1
and 6, lines 64-67 and 51-54)) wherein the row entries in the hit-list table structure are equivalent
to metadata summary sub-trees for individual documents, and wherein each document row
(sub-tree) includes a plurality of intrinsic and non-intrinsic attributes (equivalent to list-items). Brown
et al also teach wherein the attributes of each document pair are compared (i.e. if one document
is missing an attribute structure then the documents are considered distinct) and each document
is shown to be distinct from the other if their respective attributes aren't equal (Fig. 4: 430 &
450).

-In regard to all the dependent claims, the Examiner notes that as the independent 
claims remain rejected as shown above, the dependent claims do not distinguish over the prior art 
and their original rejections are still applied. 



Conclusion 

8. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time
policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action.

9. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Adam L Basehoar whose telephone number is (703) 305-7212.
The examiner can normally be reached on M-F: 7:30am - 4:00pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Heather Herndon, can be reached on (703) 308-5186. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306.

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



